Regularization Through Feature Knock Out

Authors

  • Lior Wolf
  • Ian Martin
Abstract

In this paper, we present and analyze a novel regularization technique based on enhancing our dataset with corrupted copies of the original data. The motivation is that since the learning algorithm lacks information about which parts of the data are reliable, it has to produce more robust classification functions. We then demonstrate how this regularization leads to redundancy in the resulting classifiers, which is somewhat in contrast to the common interpretations of the Occam's razor principle. Using this framework, we propose a simple addition to the gentle boosting algorithm which enables it to work with only a few examples. We test this new algorithm on a variety of datasets and show convincing results.

Copyright © Massachusetts Institute of Technology, 2004.

This report describes research done at the Center for Biological & Computational Learning, which is in the McGovern Institute for Brain Research at MIT, as well as in the Dept. of Brain & Cognitive Sciences, and which is affiliated with the Computer Sciences & Artificial Intelligence Laboratory (CSAIL).

This research was sponsored by grants from: Office of Naval Research (DARPA) Contract No. MDA972-04-1-0037, Office of Naval Research (DARPA) Contract No. N00014-02-1-0915, National Science Foundation (ITR/IM) Contract No. IIS-0085836, National Science Foundation (ITR/SYS) Contract No. IIS-0112991, National Science Foundation (ITR) Contract No. IIS-0209289, National Science Foundation-NIH (CRCNS) Contract No. EIA-0218693, National Science Foundation-NIH (CRCNS) Contract No. EIA-0218506, and National Institutes of Health (Conte) Contract No. 1 P20 MH66239-01A1.

Additional support was provided by: Central Research Institute of Electric Power Industry, Center for e-Business (MIT), Daimler-Chrysler AG, Compaq/Digital Equipment Corporation, Eastman Kodak Company, Honda R&D Co., Ltd., ITRI, Komatsu Ltd., Eugene McDermott Foundation, Merrill-Lynch, Mitsubishi Corporation, NEC Fund, Nippon Telegraph & Telephone, Oxygen, Siemens Corporate Research, Inc., Sony MOU, Sumitomo Metal Industries, Toyota Motor Corporation, and WatchVision Co., Ltd.
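The idea in the abstract of training on the original data plus corrupted copies can be illustrated with a minimal Python sketch. The abstract does not say how the copies are corrupted, so the rule used here is an assumption: each copy overwrites one randomly chosen feature of a randomly chosen example with the value that feature takes in another randomly chosen example, and keeps the label unchanged. The function name `knock_out_augment` and its parameters are hypothetical.

```python
import numpy as np


def knock_out_augment(X, y, n_copies, rng=None):
    """Return (X, y) extended with n_copies corrupted examples.

    Assumption (not stated in the abstract): a corrupted copy is made by
    replacing ONE feature of a randomly chosen example with the value of
    the same feature in another randomly chosen example.
    """
    rng = np.random.default_rng(rng)
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    n, d = X.shape

    targets = rng.integers(0, n, size=n_copies)   # examples to corrupt
    donors = rng.integers(0, n, size=n_copies)    # examples supplying values
    features = rng.integers(0, d, size=n_copies)  # feature knocked out per copy

    X_ko = X[targets].copy()
    # Overwrite exactly one entry in each corrupted row.
    X_ko[np.arange(n_copies), features] = X[donors, features]
    y_ko = y[targets]  # labels are left untouched

    # Originals first, corrupted copies appended after them.
    return np.vstack([X, X_ko]), np.concatenate([y, y_ko])
```

The augmented arrays can then be passed to any classifier in place of the original training set. Because the learner cannot tell which feature of a given row was overwritten, relying on a single feature becomes risky, which is the robustness argument the abstract makes.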

Similar resources

Study of the Economic Nature of the Barrier Options and Its Jurisprudential Analysis

The purpose of this study is to investigate the economic and jurisprudential nature of barrier options. Options are a type of derivative instrument in the financial markets that gives a person the right, without obligation, to buy or sell an asset. This tool is used along with other types of derivative tools for risk hedging and speculation. Two kinds of barrier option are the Knock-In and Knock-out...

Feature Scaling for Kernel Fisher Discriminant Analysis Using Leave-One-Out Cross Validation

Kernel Fisher discriminant analysis (KFD) is a successful approach to classification. It is well known that the key challenge in KFD lies in the selection of free parameters such as kernel parameters and regularization parameters. Here we focus on the feature-scaling kernel, where each feature is individually associated with a scaling factor. A novel algorithm, named FS-KFD, is developed to tune th...

From Transformation-Based Dimensionality Reduction to Feature Selection

Many learning applications are characterized by high dimensions. Usually not all of these dimensions are relevant and some are redundant. There are two main approaches to reduce dimensionality: feature selection and feature transformation. When one wishes to keep the original meaning of the features, feature selection is desired. Feature selection and transformation are typically presented sepa...

Nearest Neighbor Based Feature Selection for Regression and its Application to Neural Activity

We present a non-linear, simple, yet effective, feature subset selection method for regression and use it in analyzing cortical neural activity. Our algorithm involves a feature-weighted version of the k-nearest-neighbor algorithm. It is able to capture complex dependency of the target function on its input and makes use of the leave-one-out error as a natural regularization. We explain the cha...

Stationarity of Matrix Relevance Learning Vector Quantization

We investigate the convergence properties of heuristic matrix relevance updates in Learning Vector Quantization. Under mild assumptions on the training process, stationarity conditions can be worked out which characterize the outcome of training in terms of the relevance matrix. It is shown that the original training schemes single out one specific direction in feature space which depends on th...

Publication date: 2004